† Corresponding author. E-mail:
Project supported by the National Natural Science Foundation of China (Grant No. 61471055).
This paper studies the robustness of a network in terms of its structure. We define a strongly dominated relation between nodes and use this relation to merge the network. On this basis, we design a dominated clustering algorithm that aims to find the critical nodes in the network. The merging process is lossless, meaning that the original structure of the network is preserved. To enable visualization of the network, we also apply a lossy consolidation based on the detection of community structures. Simulation results show that, compared with six existing centrality algorithms, our algorithm performs better when the attack capacity is limited. The simulations also show that our algorithm works better in assortative scale-free networks.
A large number of complex systems can be represented as networks. The research scope of complex networks includes biology,[1,2] social networks,[3,4] communication,[5,6] and so on. The robustness of a network is therefore worth close attention. It has been proved that large-scale real networks are robust to random attacks while they are vulnerable to targeted attacks. Most real networks are sparse. Sparse networks[7–10] are those with a small number of short loops and edges. Therefore, in complex network research we focus on sparse networks.
Generally, the network will be broken after deleting a part of the nodes chosen by centrality algorithms. The existing centrality algorithms[11,12] include the degree algorithm,[13] the betweenness algorithm,[14] the closeness algorithm,[15] the k-core algorithm,[16] and so on. In real networks the number of nodes is large but the attack capacity is limited, that is to say, we can only attack a small part of the nodes in the network. In this case, deleting the nodes chosen by the algorithms above has little influence on the network, which means that the existing centrality algorithms do not perform well when the attack capacity is limited. In this paper we therefore aim to design an effective centrality algorithm to find the critical nodes when the attack capacity is limited.
The process of deleting nodes can be seen as a percolation process in the network. Latora[17] studied the vulnerability and protection of infrastructure networks and put forward a framework to define critical damages, critical improvements, and structural vulnerability. Karrer[18] reformulated percolation as a message-passing process and gave an effective way to calculate the size of the percolating cluster and the average cluster size in sparse networks. Agliari[19] analysed the bond percolation process on correlated random networks and showed that the network evolves differently under stochastic and deterministic processes. Azimi-Tafreshi[20] studied k-core percolation in multilayer networks and presented an analytical framework that describes the nature of the transitions. Liu[21] identified influential nodes via the SIR model and used dynamic centrality to measure the importance of nodes. Tian[22] focused on the phase transition during percolation. Explosive percolation is found in finite networks; however, explosive percolation does not appear after only a limited number of nodes are deleted. Therefore, the existing centrality algorithms are not effective when the attack capacity is low.
In this paper, we pay attention to the network structure and design a novel algorithm to measure the centrality of nodes. There is a vast body of previous research on network structure. Kanevsky[23] gave a fast algorithm for finding the minimum cut vertex sets of a network, from which we can merge the network iteratively. Rosvall[24] used maps of information flow to detect the community structure of a network. We can use the community structure to merge the network, and the nodes which have swallowed many other nodes are considered to be important to the network. Wang[25] detected community structure using nonnegative matrix factorization, which has powerful interpretability and a close relationship with clustering methods. Singh[26] found community structure in sparse networks using random walks with reluctant backtracking operators. Liu[27] defined three properties of a community and proposed a simple and novel framework to detect communities from the network topology. However, if we merge the network using only the existing community detection algorithms, the resulting critical nodes perform poorly during the attack process.
In this research we define a dominated relation between nodes based on the cut vertices of the network. Then we merge the network using the strongly dominated relation; we call this process the lossless consolidation. We design a dominated clustering algorithm that finds the critical nodes from the weights obtained during the merging process. Simulation results show the effectiveness of our algorithm in sparse real networks when the attack capacity is limited. Our algorithm also performs better in assortative scale-free networks compared with other centrality algorithms.
The rest of this paper is arranged as follows. Section
There are many indicators for measuring the vulnerability of networks. This paper uses the largest connected component (LCC) to evaluate the current communication capability of the network, in other words, the robustness of the network. Furthermore, we use the number of components (isolated nodes are excluded) and the scale of the small components to evaluate how badly the network is broken. The vulnerability indicators are defined as follows: (i) the scale of the LCC; (ii) the number of connected components; (iii) the mean scale of the small connected components.
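As an illustration, the following is a minimal sketch (assuming the networkx package that is used later in the simulations) of how these three indicators could be computed for an undirected graph; the function name and details are our own, not taken from the paper.

```python
import networkx as nx

def vulnerability_indicators(G):
    """Compute the three vulnerability indicators for an undirected graph G."""
    sizes = sorted((len(c) for c in nx.connected_components(G)), reverse=True)
    lcc_scale = sizes[0] if sizes else 0                 # scale of the LCC
    non_isolated = [s for s in sizes if s > 1]           # components excluding isolated nodes
    num_components = len(non_isolated)                   # number of connected components
    small = non_isolated[1:]                             # all components except the LCC
    mean_small_scale = sum(small) / len(small) if small else 0.0
    return lcc_scale, num_components, mean_small_scale
```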
Generally, the way to find the critical nodes is to use node centrality algorithms. Different centrality indicators measure the value of a node from different angles, so attacking the network with different centrality algorithms yields different attack performances. Here we recall some existing node centrality indicators. First, we introduce degree centrality, which is the most commonly used indicator.[13]
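As a reminder of the definition, a minimal sketch of normalized degree centrality, assuming the standard definition in which a node's centrality is its degree divided by n - 1:

```python
def degree_centrality(G):
    """Normalized degree centrality: degree divided by (n - 1)."""
    n = G.number_of_nodes()
    return {v: d / (n - 1) for v, d in G.degree()}
```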
However, the existing centrality algorithms do not always perform well, especially when the attack capacity is limited. Figure
In order to show the different performances of the centrality algorithms, we run simulations on real networks and on Barabási–Albert (BA) networks with different assortativities. Table
Here n is the number of nodes in the network. m is the number of edges,
We also compare the performance on artificial networks. BA networks are scale-free networks. Initially we have
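For reference, a minimal sketch (using networkx; the parameter values here are illustrative, not the ones used in the paper) of generating a BA network and measuring its degree assortativity coefficient:

```python
import networkx as nx

# Illustrative parameters, not those used in the paper
G = nx.barabasi_albert_graph(n=1000, m=3, seed=42)
r = nx.degree_assortativity_coefficient(G)
print(f"nodes = {G.number_of_nodes()}, edges = {G.number_of_edges()}, assortativity = {r:.3f}")
```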
In this paper, we analyze the network using a new relation between nodes, which we call the dominated relation. For an undirected network
In a 1-cut network, let node a be a node whose degree is 1 and let node b be the node connected to a. From the results above, we say that node a is dominated by node b; in other words, there is a dominated relation between nodes a and b. We also define node b as a minimum cut of the network. Then we can merge the dominated nodes whose degree is one. This process can be repeated until there is no node of degree 1.
Then we focus on the situation where the network is 2-cut. In this situation the nodes which connect only to the minimum cut are dominated by the minimum cut of the network. However, this relation is not reliable for deciding the critical vertices. In this paper we therefore define a strongly dominated relation between nodes. Suppose one minimum cut of the network is (a,b); the dominated nodes of (a,b) are
Using the strongly dominated relation between nodes, we design the dominated clustering algorithm (DCA) to find the critical vertices. Finding the important nodes amounts to finding the nodes with high profit. In our algorithm, we merge nodes step by step and use two lists, value and size, to store the node information. Value is the profit of a node and size keeps the indices of the other nodes merged into that node.
Generally, we suppose there are one-degree nodes in the network, so by Lemma 1 the network is 1-cut. At the first step, we initialize the value of every node to 1, and its size keeps only its own index. After that, we delete the dominated nodes and add their values to the values of the corresponding minimum cut nodes. The value of a deleted node is set to 0. According to Lemma 2, this process is repeated until the network has no one-degree nodes.
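The 1-cut step described above can be sketched as follows (a minimal illustration in Python with networkx; the function name and data layout are our own, not taken from the paper):

```python
import networkx as nx

def merge_degree_one_nodes(G, value, size):
    """Repeatedly merge degree-1 (dominated) nodes into their neighbors,
    adding the dominated node's value to the dominating node's value."""
    G = G.copy()
    while True:
        leaves = [v for v, d in G.degree() if d == 1]
        if not leaves:
            break
        for v in leaves:
            if v not in G or G.degree(v) != 1:
                continue                      # degree may have changed this round
            u = next(iter(G[v]))              # the (minimum cut) node dominating v
            value[u] += value[v]              # accumulate the dominated node's value
            size[u].extend(size[v])           # record which nodes u has merged
            value[v] = 0
            G.remove_node(v)
    return G, value, size

# Example usage on a small path graph 0-1-2-3-4
G = nx.path_graph(5)
value = {v: 1 for v in G}
size = {v: [v] for v in G}
G_merged, value, size = merge_degree_one_nodes(G, value, size)
```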
Then we deal with the 2-cut situation. Suppose (a,b) is a minimum cut and
While dealing with the 2-cut situation, new one-degree nodes may appear in the network; these are handled with the 1-cut method described above. When the network is k-cut, the algorithm extends naturally to the k-cut case.
At the merging step, we merge the k minimum cut nodes as in the 2-cut case. The value is set as
Figure
Overall our algorithm is arranged as:
From formula (
In Fig.
A characteristic of our algorithm is that it focuses on the local structure of the network through cut nodes, which lets us understand the network structure more clearly. The existing centralities can be divided into three classes: neighborhood-based centralities, path-based centralities, and iterative refinement centralities. Neighborhood-based centralities include degree, k-core, H-index, and so on; these focus on the neighbor nodes of the central node. Path-based centralities include betweenness, closeness, and subgraph centrality; these concentrate on the shortest paths between nodes. Iterative refinement centralities include eigenvector, PageRank, HITS, and so on; these are based on the idea that the influence of a node is determined not only by the number of its neighbor nodes but also by the influence of those neighbors, so every node gets support from its neighbor nodes. However, no existing centrality concentrates on local dominated relations between nodes. In our algorithm, we calculate the node centrality by node mergence, which is another characteristic of our algorithm.
After the lossless consolidation, we can also perform a lossy consolidation, which means the network is further merged after applying our algorithm. The significance of the lossy consolidation is the visualization of the network: a real network has too many nodes to show its details, so we merge nodes first and the structure of the network becomes easy to display. In the lossy consolidation, we first run community detection on the merged network. In every community, there are nodes which are connected with nodes in other communities; we call these nodes gate nodes. In a community C,
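As an illustration of this step, a minimal sketch of detecting communities and identifying their gate nodes (the community detection method and function names are our own choices for illustration; the paper's exact procedure after identifying the gate nodes is not reproduced here):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def find_gate_nodes(G):
    """Detect communities and identify each community's gate nodes,
    i.e., nodes with at least one edge leading to another community."""
    communities = list(greedy_modularity_communities(G))
    membership = {v: i for i, comm in enumerate(communities) for v in comm}
    gate_nodes = {i: set() for i in range(len(communities))}
    for u, v in G.edges():
        if membership[u] != membership[v]:
            gate_nodes[membership[u]].add(u)
            gate_nodes[membership[v]].add(v)
    return communities, gate_nodes
```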
To show the efficiency of our algorithm, we run simulations on the dataset. Figure
Table
To verify the effectiveness of the algorithm, we use Spyder on the Python platform to implement the proposed DCA algorithm, and we compare the simulation results of the different algorithms on different networks. In the simulations we use the networkx package: nx.degree(), nx.betweenness_centrality(), nx.pagerank(), nx.closeness_centrality(), nx.eigenvector_centrality(), and nx.k_core() are used to calculate the six centralities. In order to guarantee the accuracy of the simulation results, we enlarge the scale of the networks and repeat the experiments to reduce the error. The reported simulation results are averages over the experimental runs.
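For completeness, a minimal sketch of collecting per-node scores for these six baseline centralities (note that nx.k_core() returns a subgraph, so in this sketch we use nx.core_number() to obtain a per-node k-core value; this substitution is our own):

```python
import networkx as nx

def six_centralities(G):
    """Per-node scores for the six baseline centralities."""
    return {
        "degree":      dict(nx.degree(G)),
        "betweenness": nx.betweenness_centrality(G),
        "pagerank":    nx.pagerank(G),
        "closeness":   nx.closeness_centrality(G),
        "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
        "k-core":      nx.core_number(G),   # per-node core number (nx.k_core returns a subgraph)
    }
```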
In this part, we analyze the correlation between the dominated clustering algorithm and the existing algorithms. Table
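One way to quantify such a correlation (our own illustration; the correlation measure used in the paper may differ) is a rank correlation, such as Kendall's tau, between two centrality score dictionaries:

```python
from scipy.stats import kendalltau

def rank_correlation(scores_a, scores_b):
    """Kendall's tau between two centrality score dictionaries over the same node set."""
    nodes = sorted(scores_a)
    tau, p_value = kendalltau([scores_a[v] for v in nodes],
                              [scores_b[v] for v in nodes])
    return tau, p_value
```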
Figure
Attacking the network using a single indicator alone is rather naive. Many existing studies describe this process as an influence maximization problem (IMP): given a network G(V, E), the target is to find a set S that maximizes a function f(S). In this paper, f(S) represents the degree of damage to the network, which is related to the LCC. Here we use a heuristic algorithm.[38] We obtain the removed nodes by adaptive recalculation, that is, we first choose the node with the largest centrality and then recalculate the centrality after removing the chosen node. We apply this to degree and betweenness because of their better performance. A greedy algorithm can also be used for choosing nodes: we take the top n nodes of every centrality as a strategy set, and at every removal step we choose the node which maximizes f(S).
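A minimal sketch of the adaptive recalculation attack (illustrated here with degree centrality; the centrality function and the removal budget are placeholders, not the paper's settings):

```python
import networkx as nx

def adaptive_attack(G, centrality=nx.degree_centrality, budget=10):
    """Adaptively remove nodes: pick the node with the largest centrality,
    remove it, recalculate, and repeat, recording the LCC scale at each step."""
    G = G.copy()
    lcc_history = []
    for _ in range(budget):
        scores = centrality(G)
        if not scores:
            break
        target = max(scores, key=scores.get)
        G.remove_node(target)
        lcc_history.append(max((len(c) for c in nx.connected_components(G)), default=0))
    return lcc_history
```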
In Fig.
In this section, we pay more attention to the performance of the algorithms on real networks. Tables
To show more detailed information about our algorithm, we display the results of deleting a fraction of nodes from 0 to 1 in the router network in Fig.
After that, we compare the attack performance of the proposed dominated clustering algorithm (DCA) with that of other centrality algorithms. Different from the previous simulations, this section focuses on the performance in BA networks with different assortativity coefficients. To simplify the simulation and make the figures clearer, we choose the degree algorithm (DA) to represent the centrality algorithms because of its popularity and good performance.
The simulation results are shown in Fig.
In this paper, we focus on the dominated relation between nodes in a network. Based on the minimum cut sets of the network, we define a strongly dominated relation between nodes, and we then merge the network according to this relation. The merging process is lossless, because it only merges nodes that are dominated. We design the dominated clustering algorithm with an efficient data structure to realize this process, and with DCA we can find the critical nodes in the network. We also design a lossy consolidation algorithm which is helpful for visualization. In the simulations we compare the performance of our algorithm with that of other centrality algorithms. The results show that when the attack capacity is limited, DCA performs best in the real networks; furthermore, DCA is effective when the network is sparse. From the simulations on scale-free networks, we find that DCA works better in assortative scale-free networks.
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]